bivariate random variables and let $m(x) = E(Y \mid X = x)$ be the regression curve of $Y$
on $X$. In this paper we consider the estimation of zeros and extrema of the
regression curve via stochastic approximation methods. We present consistency
1. Introduction


Let $(X, Y), (X_1, Y_1), (X_2, Y_2), \ldots$ be a sequence of independent, identically distributed,
bivariate random variables with joint probability density function $f(x,y)$. In this paper we consider
the sequential estimation of zeros and extrema of $m(x) = E(Y \mid X = x)$ using a combination of
the nonparametric kernel and stochastic approximation methods. The structure of our sampling
scheme is different from the one considered by Robbins and Monro (1951) since the experimenter,
observing the bivariate data, has no control over the design variables $\{X_i\}$, as is assumed in
classical stochastic approximation algorithms.

The proposed sequential procedure is based on the principal idea of nonparametric kernel
estimation of $m(x)$, i.e. to construct a weighted average of those observations $(X_i, Y_i)$ of which
$X_i$ happens to fall into an asymptotically shrinking neighborhood of $x$. The shrinkage of such a
neighborhood is usually parameterized by a sequence of bandwidths $h_n$ tending to zero, whereas
the shape of the neighborhoods is given by a real kernel function $K$.

Motivated by classical procedures we define the following sequential estimator of a zero of
$m$,

(1) $Z_{n+1} = Z_n - a_n h_n^{-1} K((Z_n - X_n)/h_n)\, Y_n, \quad n \ge 1.$

Here $Z_1$ denotes an arbitrary starting random variable with finite second moment and $\{a_n\}$ is
a sequence of positive constants tending to zero. In fact, the sequence $\{Z_n\}$ will converge under
our conditions to the (unique) zero of

$$\bar m(x) = \int y f(x,y)\,dy = m(x) f_X(x),$$

where $f_X(x)$ denotes the marginal density of $X$, but an assumption about $f_X$ ensures that the
zeros of the two functions $m$ and $\bar m$ are identical.
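
As an illustration of how recursion (1) processes the data one pair at a time, the following Python sketch may help. It is not the authors' code: the step sizes $a_n = 1/n$ and $h_n = n^{-\gamma}$, the Epanechnikov kernel, the projection of each iterate onto $[0, 1]$, the simulated test data, and the helper names `sequential_zero` and `simulate_pairs` are all assumptions made here for the example.

```python
import random


def epanechnikov(u):
    """Epanechnikov kernel K(u) = 3/4 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def sequential_zero(pairs, z1, gamma=0.21, lo=0.0, hi=1.0):
    """Run recursion (1) over observed pairs (x_n, y_n), n = 1, 2, ...

    Assumptions of this sketch: a_n = 1/n, h_n = n**(-gamma), and a
    projection of each iterate onto [lo, hi] (not part of recursion (1)).
    """
    z = z1
    for n, (x, y) in enumerate(pairs, start=1):
        a_n = 1.0 / n
        h_n = n ** (-gamma)
        # Kernel-weighted Robbins-Monro type update, only one number is stored.
        z = z - a_n / h_n * epanechnikov((z - x) / h_n) * y
        # Projection step, added here so the iterate cannot leave the support of X.
        z = min(max(z, lo), hi)
    return z


def simulate_pairs(n, a=4.0, sigma=0.1, seed=0):
    """Hypothetical test data: X ~ U[0, 1], Y = m(X) + sigma * N(0, 1),
    with m(x) = -a * ((1 - x)**2 - 1/4), whose zero is at x = 1/2."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.random()
        y = -a * ((1.0 - x) ** 2 - 0.25) + sigma * rng.gauss(0.0, 1.0)
        pairs.append((x, y))
    return pairs


z_hat = sequential_zero(simulate_pairs(20000), z1=0.45)
print(z_hat)
```

The projection is a design choice of the sketch: with a compactly supported kernel and a design density supported on $[0, 1]$, an early large step could otherwise move the iterate to a region where no later observation falls into the shrinking window.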

Under  mild  conditions  we  show  consistency  (almost  surely  and  in  quadratic  mean)  and 
asymptotic  normality  of  { Zn }.  An  asymptotic  bias  term  (depending  on  the  smoothness  of  m) 
shows  up,  if  the  bandwidth  sequence  tends  to  zero  at  a  specific  rate.  Fixed  width  confidence 
intervals  are  constructed,  using  a  suitable  stopping  rule  based  on  estimates  of  the  variance  of 
the  asymptotic  normal  distribution. 

Our arguments can be extended to the problem of estimating extremal values of the regression
function $m$. Note that $m = \bar m / f_X$ and therefore $m' = r / f_X^2$, where

$$r(x) = f_X(x) \int y \frac{\partial}{\partial x} f(x,y)\,dy - \bar m(x) \frac{\partial}{\partial x} f_X(x).$$

Under a suitable assumption the problem of finding an extremum of $m$ is equivalent to finding
a (unique) zero of the function $r$. So it is reasonable to apply a procedure similar to (1).
Additional difficulties turn up since $f_X$ has to be estimated separately. We propose to perform the
estimation by an additional i.i.d. sequence $\{\tilde X_n\}$ with the same distribution as $X$. Define

(2) $Z'_{n+1} = Z'_n - a_n h_n^{-3} K((Z'_n - \tilde X_n)/h_n)\, K'((Z'_n - X_n)/h_n)\, Y_n$
$\qquad\qquad + a_n h_n^{-3} K'((Z'_n - \tilde X_n)/h_n)\, K((Z'_n - X_n)/h_n)\, Y_n, \quad n \ge 1.$

We shall prove that $\{Z'_n\}$ is consistent and asymptotically normally distributed. Fixed
width confidence intervals are computed by the same technique as for $\{Z_n\}$.
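
A single update of recursion (2) can be sketched in Python as follows. This is only an illustration under stated assumptions: the Gaussian kernel (chosen here because it is smooth), the step sizes $a_n = 1/n$ and $h_n = n^{-\gamma}$ with $\gamma = 0.18$, and the function names are not taken from the paper.

```python
import math


def gauss(u):
    """Standard normal density, used here as a smooth kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def gauss_prime(u):
    """Derivative K'(u) = -u K(u) of the Gaussian kernel."""
    return -u * gauss(u)


def extremum_step(z, x, y, x_tilde, n, gamma=0.18):
    """One update of recursion (2) with a_n = 1/n and h_n = n**(-gamma).

    (x, y) is the n-th data pair and x_tilde the n-th element of the
    additional i.i.d. sequence with the same distribution as X.
    """
    a_n = 1.0 / n
    h_n = n ** (-gamma)
    u = (z - x) / h_n              # argument built from the data pair
    u_tilde = (z - x_tilde) / h_n  # argument built from the extra sequence
    # One-observation estimate of r(z): f_X-part times derivative part,
    # minus the term with the roles of K and K' exchanged.
    drift = gauss(u_tilde) * gauss_prime(u) * y - gauss_prime(u_tilde) * gauss(u) * y
    return z - a_n * h_n ** (-3) * drift
```

Calling `extremum_step` in a loop over $n = 1, 2, \ldots$ and feeding back the returned value gives the whole sequence; as in (1), only the current iterate has to be kept in memory.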

If we knew $f_X$ the algorithm (2) would simplify, since the additional $\{\tilde X_n\}$ are not needed in this
case. Here we propose

(3) $Z'_{n+1} = Z'_n - a_n h_n^{-2} K'((Z'_n - X_n)/h_n)\, Y_n\, f_X(Z'_n)$
$\qquad\qquad + a_n h_n^{-1} K((Z'_n - X_n)/h_n)\, Y_n\, f_X'(Z'_n), \quad n \ge 1.$

The additional difficulty of simultaneously estimating $f_X$ did not occur in the case of
estimating zeros, since the problem for $m$ could be transferred to the equivalent problem for $\bar m$,
which does not involve $f_X$. In practice the additional i.i.d. sequence $\{\tilde X_n\}$ could be constructed
by sampling in pairs and discarding the $Y$ observations of one element. This results in some
loss of efficiency but makes the practical application possible with the data at hand. Another
proposal that we would like to make is related to the bootstrap. From the first $N$ observations,
a density estimate $\hat f_X$ of $f_X$ could be constructed and then the algorithm (2) could be started
with $\{\tilde X_n\}$ distributed with density $\hat f_X$. A third possibility is to plug $\hat f_X$ into the algorithm
(3). We did not investigate the last mentioned procedures.

An alternative way of defining an estimator of the zero of the regression function $m$ could be
to construct an estimate of the whole function and then to determine an observed zero empirically
as an estimate. This procedure would be time consuming in the case of sequential observation
of the data, since for every new observation the whole function has to be reconstructed, whereas
our procedure just keeps one number in memory and updates that number according to the formal
prescription (1). Also in cases where an enormous amount of data has to be processed, an estimate
of a zero based on the estimate of the whole regression function seems to be inadvisable since
all the data have to be stored in memory at the same time.

Related work was done by Révész (1977) and Rutkowski (1981, 1982) who applied stochastic
approximation methods to the estimation of $m$ at a fixed point. Our derivation of fixed width
confidence intervals was inspired by the papers of Chow and Robbins (1965), McLeish (1976)
and Stute (1983). The author last mentioned used in the field of density estimation the kernel
estimation technique that introduces a localizing effect which makes classical methods, such as
Venter's (1966), applicable.

The rest of the paper is organized as follows. Section 2 contains the results and gives
the consistency proof for $\{Z_n\}$. In Section 3 we present the results of some simulations and an
application of $\{Z_n\}$ to some real data. In the last section we give the rest of the proofs.
2. Results

A crucial assumption that makes the problem identifiable through $\bar m$ [resp. $r$] is the
following.

(A1) The marginal density $f_X$ of $X$ is positive.

The speed of convergence of $\{a_n\}$ and $\{h_n\}$ is controlled by

(A2.1) $\sum_n a_n = \infty$, $\sum_n a_n h_n$ [...]

(A2.2) [...]

(A2.3) [...]


The zero $\Theta_0$ of $m(x)$ (and of $\bar m(x)$) is identified by

(A3) $\inf\,(z - \Theta_0)\,\bar m(z) > 0$ for all $\varepsilon > 0$.

Smoothness of $\bar m$ is guaranteed by

(A4.1) $\bar m$ is Lipschitz continuous;

(A4.2) $\bar m$ is differentiable in a neighborhood of $\Theta_0$ such that $\bar m'(\Theta_0) > 1/4$;

(A4.3) $\bar m$ is twice continuously differentiable.

The kernel function $K$ has to satisfy the following conditions.

(A5.1) $K$ is bounded and

$$\int K(u)\,du = 1, \quad \int u K(u)\,du = 0, \quad \int u^2 K(u)\,du < \infty;$$

(A5.2) $K$ is differentiable and

$$\lim_{|u| \to \infty} |u K(u)| = 0, \quad \int |u|\, K^2(u)\,du < \infty;$$

(A5.3) $K$ is twice differentiable and

$$\lim_{|u| \to \infty} |u K'(u)| = 0, \quad \int |u|\, K^2(u)\,du < \infty.$$


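The joint density $f(x,y)$ has to be smooth in its first argument.

(A6.1) $|f(x,y) - f(z,y)| \le |x - z|\, g_1(y)$ such that $\int (y^2 + 1)\, g_1(y)\,dy < \infty$.

(A6.2) $\frac{\partial}{\partial x} f(x,y)$ is continuous and

$$\left| \frac{\partial}{\partial x} f(u,y) - \frac{\partial}{\partial x} f(v,y) \right| \le |u - v|\, g_2(y)$$

such that $\int (|y| + 1)\, g_2(y)\,dy < \infty$.

Moment assumptions are

(A7) $E\,Y^2 < \infty$ and $\sup_x E(Y^2 \mid X = x) < \infty$.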

We have split up the assumptions into several subparts since we will use the subparts
separately. The consistency of $\{Z_n\}$ is shown in

Theorem 1. Assume (A1), (A2.1), (A2.2), (A3), (A4.1), (A5.1), (A7). Then $\{Z_n\}$ converges to
$\Theta_0$ almost surely and in the quadratic mean.

Since the proof of this theorem is very simple and exemplifies the combination of the kernel
method with stochastic approximation arguments we would like to give it here. The
proofs of the following results are deferred to Section 4.

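Write

$$Z_{n+1} = Z_n - a_n \bar m(Z_n) + a_n V_n,$$
$$V_n = \bar m(Z_n) - K_{h_n}(Z_n - X_n)\, Y_n,$$

where $K_h(u) = h^{-1} K(u/h)$.

Let $\mathcal{B}_n = \sigma(Z_1, Z_2, \ldots, Z_n)$. Condition (A4.1) implies that

$$E(V_n \mid \mathcal{B}_n) = O(h_n) \quad \text{a.s.},$$
$$E(V_n^2) = O(E(Z_n - \Theta_0)^2) + O(h_n^{-2}).$$

Observe that with (A3) and a Lipschitz constant $L_m$

$$(Z_{n+1} - \Theta_0)^2 = (Z_n - \Theta_0)^2 - 2 a_n \bar m(Z_n)(Z_n - \Theta_0) + a_n^2 \bar m^2(Z_n) + 2 a_n V_n (Z_n - \Theta_0 - a_n \bar m(Z_n)) + a_n^2 V_n^2$$
$$\le (1 + a_n^2 L_m^2)(Z_n - \Theta_0)^2 + a_n^2 V_n^2 + 2 a_n V_n (Z_n - \Theta_0 - a_n \bar m(Z_n)).$$

Hence by (A7),

$$E((Z_{n+1} - \Theta_0)^2 \mid \mathcal{B}_n) \le (1 + a_n^2 L_m^2)(Z_n - \Theta_0)^2 + O(a_n h_n)(1 + a_n L_m)\,|Z_n - \Theta_0| + a_n^2 E(V_n^2 \mid \mathcal{B}_n)$$
$$\le (1 + \beta_n)(Z_n - \Theta_0)^2 + \delta_n,$$

where

$$\beta_n = O(h_n^{-2} a_n^2 + h_n a_n + a_n^2),$$
$$\delta_n = O(h_n a_n + h_n^{-2} a_n^2),$$

if we use the simple inequalities

$$|Z_n - \Theta_0| \le 1 + |Z_n - \Theta_0|^2,$$
$$E(V_n^2 \mid \mathcal{B}_n) \le 2 \bar m^2(Z_n) + O(h_n^{-2}) \sup_x E(Y^2 \mid X = x).$$

Note that by (A2.1), (A2.2) $\sum \beta_n < \infty$ and $\sum \delta_n < \infty$.

The assertion follows now from Venter (1966), Theorem 1. Nixdorf (1982), Theorem 1.1.2, has
given a corrected version.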
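
The bound $E(V_n \mid \mathcal{B}_n) = O(h_n)$ used in this proof can be checked by a change of variables. The following sketch assumes in addition that $\int |u|\,|K(u)|\,du < \infty$ (true, for instance, for any bounded kernel with compact support) and uses that $(X_n, Y_n)$ is independent of $\mathcal{B}_n$ while $Z_n$ is $\mathcal{B}_n$-measurable.

```latex
\begin{align*}
E\{K_{h_n}(z - X_n)\,Y_n\}
  &= \int\!\!\int h_n^{-1} K\big((z - x)/h_n\big)\, y\, f(x,y)\,dy\,dx
   = \int K(u)\, \bar m(z - h_n u)\,du, \\
\big|\bar m(z) - E\{K_{h_n}(z - X_n)\,Y_n\}\big|
  &= \Big| \int K(u)\,\{\bar m(z) - \bar m(z - h_n u)\}\,du \Big|
   \le L_m\, h_n \int |u|\,|K(u)|\,du = O(h_n).
\end{align*}
```

Setting $z = Z_n$ gives the stated bound, with a constant that does not depend on $Z_n$.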

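The asymptotic normality is shown in

Theorem 2. Assume (A1), (A3), (A4.2), (A4.3), (A5.1).
Let $a_n = n^{-1}$, $h_n = n^{-\gamma}$, $1/5 \le \gamma < 1/2$.
Then

$$n^{(1-\gamma)/2}\{Z_n - \Theta_0\} \xrightarrow{\mathcal{L}} N(b(\gamma), \sigma^2(\gamma)),$$

where

$$b(\gamma) = 0 \quad \text{if } 1/5 < \gamma < 1/2,$$
$$b(\gamma) = \bar m''(\Theta_0) \int u^2 K(u)\,du \,/\, (2 \bar m'(\Theta_0) - 1 + \gamma) \quad \text{if } \gamma = 1/5;$$
$$\sigma^2(\gamma) = \int K^2 \int y^2 f(\Theta_0, y)\,dy \,/\, (2 \bar m'(\Theta_0) - 1 + \gamma).$$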

Fixed width asymptotic confidence intervals for the unknown parameter $\Theta_0$ are constructed via
estimators of $b(\gamma)$ and $\sigma^2(\gamma)$.

Estimators of $\int y^2 f(\Theta_0, y)\,dy$, $\bar m'(\Theta_0)$, $\bar m''(\Theta_0)$ are

(4) $S_{1n} = n^{-1} \sum_{i=1}^{n} K_{h_i}(Z_i - X_i)\, Y_i^2,$

(5) $S_{2n} = n^{-1} \sum_{i=1}^{n} K'_{h_i}(Z_i - X_i)\, Y_i,$

$\quad\;\; S_{3n} = n^{-1} \sum_{i=1}^{n} K''_{h_i}(Z_i - X_i)\, Y_i,$

respectively.

An estimator for the asymptotic variance $\sigma^2(\gamma)$ is therefore

$$s_n = \int K^2\, S_{1n} / (2 S_{2n} - 1 + \gamma) \quad \text{if } 2 S_{2n} - 1 + \gamma > 0,$$
$$s_n = 0 \quad \text{else}.$$
So the following stopping rule seems reasonable.

(6) $N(d) = \inf\{n \in \mathbb{N} \mid s_n + n^{-1} \le n^{1-\gamma} d^2 / z_{\alpha/2}^2\},$

where $z_{\alpha/2}$ is the $(1 - \alpha/2)$-quantile of the standard normal distribution.
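
The variance estimator $s_n$ and the stopping rule (6) translate almost literally into code. The Python sketch below is an illustration only: the function names, the convention that the caller supplies the running values $S_{1n}$, $S_{2n}$ and the constant $\int K^2$, and the use of `statistics.NormalDist` for the normal quantile are choices made here, not taken from the paper.

```python
from statistics import NormalDist


def variance_estimate(s1, s2, gamma, int_k2):
    """s_n = int K^2 * S_1n / (2 S_2n - 1 + gamma) if the denominator
    is positive, and 0 otherwise."""
    denom = 2.0 * s2 - 1.0 + gamma
    return int_k2 * s1 / denom if denom > 0.0 else 0.0


def stopping_time(s_values, d, gamma, alpha=0.05):
    """First n with s_n + 1/n <= n**(1 - gamma) * d**2 / z**2, as in rule (6).

    s_values[n - 1] is the variance estimate s_n; returns None if the rule
    does not fire within the supplied values.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # (1 - alpha/2)-quantile
    for n, s_n in enumerate(s_values, start=1):
        if s_n + 1.0 / n <= n ** (1.0 - gamma) * d * d / (z * z):
            return n
    return None
```

The term $n^{-1}$ in (6) keeps the rule from stopping too early when $s_n$ happens to be zero or very small in the first few steps.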

The fixed width confidence intervals are constructed via

Theorem 3. Let $a_n = n^{-1}$, $h_n = n^{-\gamma}$, $1/5 \le \gamma < 1/3$ and assume (A1), (A3), (A4.2), (A4.3),
(A5.1), (A5.2). Then if $N(d)$ is defined as in (6) for some $0 < \alpha < 1$, as $d \to 0$,

$$N(d)^{(1-\gamma)/2}\{Z_{N(d)} - \Theta_0\} \xrightarrow{\mathcal{L}} N(b(\gamma), \sigma^2(\gamma)).$$

In the case $1/5 < \gamma < 1/3$ an asymptotic confidence interval of fixed length $2d$ and
asymptotic coverage probability $1 - \alpha$ is given by

$$[Z_{N(d)} - d,\; Z_{N(d)} + d].$$

For $\gamma = 1/5$ the bias can be estimated by

$$b_n = \int u^2 K(u)\,du\; S_{3n} / (2 S_{2n} - 1 + \gamma).$$

Then with $H_n = Z_n - n^{-(1-\gamma)/2}\, b_n$ an asymptotic confidence interval is given by

$$[H_{N(d)} - d,\; H_{N(d)} + d].$$

Remark 1. The range of $\gamma$ had to be reduced to $1/5 \le \gamma < 1/3$ since otherwise $S_{2n}$
would no longer be a consistent estimator of $\bar m'(\Theta_0)$.

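Remark 2. It will be seen in the proof of Theorem 3 that, as $d \to 0$, $N(d)/b(d) \to 1$
almost surely where $b(d) = \inf\{n \in \mathbb{N} \mid \sigma^2(\gamma) \le n^{1-\gamma} d^2 / z_{\alpha/2}^2\}$. Therefore $N(d)$ exhibits the
following limit behavior,

$$N(d)\, d^{2/(1-\gamma)} \to \big(\sigma^2(\gamma)\, z_{\alpha/2}^2\big)^{1/(1-\gamma)} \quad \text{almost surely},$$

as $d \to 0$.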

The analysis of the sequential procedure $\{Z'_n\}$ is quite analogous to that of $\{Z_n\}$. We define
the (unique) zero of $r$ as $\Theta_M$.

Theorem 4. Assume (A1), (A2.1), (A2.3), (A5.1), (A5.3), (A6.1), (A7) and let (A3), (A4.1) be
fulfilled with $r$ in the place of $\bar m$. Then $\{Z'_n\}$ converges to $\Theta_M$ almost surely and in the
quadratic mean.

The next theorem gives the asymptotic normality of $\{Z'_n\}$.

Theorem 5. Let $a_n = n^{-1}$ and $h_n = n^{-\gamma}$, $1/6 < \gamma < 1/5$, and assume (A1), (A5.1), (A5.3),
(A6.2), (A7) and (A3), (A4.2) with $r$ in the place of $\bar m$.
Then

$$n^{(1-4\gamma)/2}\{Z'_n - \Theta_M\} \xrightarrow{\mathcal{L}} N(0, \sigma_M^2(\gamma)),$$

where

$$\sigma_M^2(\gamma) = f_X(\Theta_M) \int y^2 f(\Theta_M, y)\,dy \int K^2 \int (K')^2 \,/\, (2 r'(\Theta_M) - 1 + 4\gamma).$$


Remark 3. For simplicity of presentation we did not arrange for a wider range of $\gamma$ such
that an asymptotic bias term occurs. If $r$ is twice continuously differentiable then the range of
allowable exponents can be extended to $1/8 < \gamma < 1/4$. The discussion would be in analogy to
Theorem 2 with $r$ in the place of $\bar m$.

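Estimators for the numerator and denominator of $\sigma_M^2(\gamma)$ are constructed in the following
way.

$$S'_{1n} = n^{-1} \sum_{i=1}^{n} K_{h_i}(Z'_i - \tilde X_i)\; n^{-1} \sum_{i=1}^{n} K_{h_i}(Z'_i - X_i)\, Y_i^2$$

is an estimator for $f_X(\Theta_M) \int y^2 f(\Theta_M, y)\,dy$, whereas

$$S'_{2n} = n^{-1} \sum_{i=1}^{n} K_{h_i}(Z'_i - \tilde X_i)\; n^{-1} \sum_{i=1}^{n} K''_{h_i}(Z'_i - X_i)\, Y_i - n^{-1} \sum_{i=1}^{n} K''_{h_i}(Z'_i - \tilde X_i)\; n^{-1} \sum_{i=1}^{n} K_{h_i}(Z'_i - X_i)\, Y_i$$

converges under our assumptions to $r'(\Theta_M)$, almost surely. Define

$$s'_n = \int K^2 \int (K')^2\, S'_{1n} / (2 S'_{2n} - 1 + 4\gamma),$$
$$N'(d) = \inf\{n \in \mathbb{N} \mid s'_n + n^{-1} \le n^{1-4\gamma} d^2 / z_{\alpha/2}^2\}.$$

Then parallel to Theorem 3 we have

Theorem 6. Let $a_n = n^{-1}$ and $h_n = n^{-\gamma}$, $1/6 < \gamma < 1/5$, and let the conditions of Theorem 5
be fulfilled. Then, as $d \to 0$,

$$N'(d)^{(1-4\gamma)/2}\{Z'_{N'(d)} - \Theta_M\} \xrightarrow{\mathcal{L}} N(0, \sigma_M^2(\gamma)).$$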
3. Monte Carlo Study and an Application

In this section we report the results of a Monte Carlo experiment comparing the performance
of our sequential procedure when some of the involved parameters are tuned at different levels.
We also report an application of the algorithm (1) to some real data.

The basic experiment to assess the accuracy of Theorem 3 consisted of 200 Monte Carlo
replications with the numbers $N(d)$, $Z_{N(d)}$ and $s_{N(d)}$ to be reported. The joint probability
density function $f(x,y)$ that we used was $f(x,y) = I_{[0,1]}(x)\, \varphi((y - m(x))/\sigma)/\sigma$, with $\varphi$ the probability
density function of a standard normal distribution, and $m(x) = -a\{(1 - x)^2 - 1/4\}$ for $a = 4, 8$
was the regression curve. We report the result for $Z_1 = 0.45$ (Table 1) and for $Z_1 = 0.2$ (Table
2). The parameter $\alpha$ was set to $\alpha = 0.05$. The zero that was to be estimated was $\Theta_0 = 1/2$
and two different values of $d$ and $\sigma$ were fixed, namely $d = 0.05, 0.1$ and $\sigma = 0.1, 1.0$. As the
kernel $K$ we have chosen the Epanechnikov kernel $K(u) = 3/4\,(1 - u^2)$ for $|u| \le 1$ and $K(u) = 0$
for $|u| > 1$. The sequence of bandwidths was set to $h = h_n = n^{-\gamma}$, $\gamma = 0.21$. In Table 1 the
results for the starting point $Z_1 = 0.45$ are shown. The figures of Table 1 indicate that the fixed
accuracy result given in Theorem 3 yields a good approximation of $\Theta_0$ even for $d = 0.1$.

Table 1 about here

This is seen from the
counts in the $Z_{N(d)}$ column. It is indicated there how many times (from 200 Monte Carlo
trials) the true parameter $\Theta_0 = 1/2$ was in the confidence interval $[Z_{N(d)} - d, Z_{N(d)} + d]$. As
a measure of spread we added the quantiles $Q_K$ and $Q_S$ in the third and fourth column of
each entry. A small paradox occurs when we compare the figures for different values of $a$. It
is expected that the procedure (1) stops earlier with $a = 8$ than with $a = 4$, since the higher
derivative in the zero should speed up the convergence of $\{Z_n\}$ to $\Theta_0$. In both Table 1 and Table
2 it is seen that the average of the stopping times (over 200 Monte Carlo runs) is considerably
higher for $a = 8$ and $\sigma = 0.1$ than for $a = 4$ and $\sigma = 0.1$.

Table 2 about here

This effect is due to the crude approximation $\mathrm{var}(Y \mid X = x) \approx \sigma^2$, $x \approx \Theta_0$, as can be seen
from the figures for $s_{N(d)}$. In the case of $a = 8$ the statistic $s_{N(d)}$ considerably overestimates
the true asymptotic variance $\sigma^2(\gamma)$. For comparison we list some correct $\sigma^2(\gamma) = \sigma^2(\sigma, a, \gamma)$. For
instance, $\sigma^2(0.1, 4, 0.21) = 0.00083$ whereas $\sigma^2(0.1, 8, 0.21) = 0.00039$.
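
The two values just quoted can be reproduced from the variance formula of Theorem 2. The Python sketch below is a check under the simulation design described above, not code from the paper: it uses that the design density is uniform on $[0, 1]$, so that $\int y^2 f(1/2, y)\,dy = \sigma^2$ and the derivative of $m f_X$ at $1/2$ equals $a$; the function names and the midpoint rule are choices made here.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 3/4 (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def integral_k_squared(kernel, lo=-1.0, hi=1.0, steps=100000):
    """Midpoint-rule approximation of the integral of kernel(u)**2 over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(kernel(lo + (i + 0.5) * width) ** 2 for i in range(steps)) * width


def asymptotic_variance(sigma, a, gamma):
    """Asymptotic variance of Theorem 2 for the simulation design."""
    int_k2 = integral_k_squared(epanechnikov)  # equals 3/5 analytically
    second_moment = sigma ** 2                 # int y^2 f(1/2, y) dy, since m(1/2) = 0
    slope = a                                  # derivative of m * f_X at 1/2
    return int_k2 * second_moment / (2.0 * slope - 1.0 + gamma)


print(round(asymptotic_variance(0.1, 4, 0.21), 5))  # 0.00083
print(round(asymptotic_variance(0.1, 8, 0.21), 5))  # 0.00039
```

The larger slope $a = 8$ roughly halves the asymptotic variance, which is why a shorter stopping time would be expected for $a = 8$.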

In a small application we took the sequence of random variables $\{(X_i, Y_i)\}$ ($X_i$ = age, $Y_i$ = weight
of female corpses) which was gathered from 1969 to 1981 by the Institute of Forensic Medicine
of Heidelberg. It is an interesting question in forensic medicine to estimate the mean age from
the weight of unknown corpses. We restricted our attention to the ages between 0 and 20 years
in order to fulfill assumption (A3). We put $m_0 = 40$ kg, and we applied the procedure (1) and
ended with different starting values $Z_1$ at $Z_{N(d)} = 11.0$ years and $N(d) =$ SG3 for $d = 0.1$ and
$N(d) = 224$ for $d = 0.2$ ($Z_1 = 0.4$). A plot of the first 732 data pairs, restricted to ages between
0 and 20 years, should illustrate the accuracy of $Z_{N(d)}$ (Figure 1).
4. Proofs

The theorems are proved by a functional central limit theorem given by Berger (1980), who
extended a result of Walk (1977) in a way that made it applicable in our setting. Lemma 1 describes
the asymptotic behavior of

(7) $W_n(t) = n^{-1/2} R_{[nt]} + n^{-1/2}(nt - [nt])(R_{[nt]+1} - R_{[nt]}), \quad 0 \le t \le 1,$

$\quad\;\; R_k = k^{1/2}\,[\,k^{(1-\gamma)/2}(Z_k - \Theta_0) - b(\gamma)\,], \quad k \in \mathbb{N}.$

Lemma 1. Let the conditions of Theorem 3 be satisfied. Then $W_n(t)$, as defined in (7), converges
weakly in $C[0, 1]$ to the Gaussian process

(8) $G_1(t) = \Big( \int K^2 \int y^2 f(\Theta_0, y)\,dy \Big)^{1/2} \int_{(0,1]} u^{\bar m'(\Theta_0) - (2-\gamma)/2}\, dW(tu), \quad 0 \le t \le 1,$

where $W$ is the standard Wiener process starting at 0.


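Proof. Define

$$\mathcal{B}_n = \sigma\{Z_1, \ldots, Z_n\},$$
$$Z_{n+1} - \Theta_0 = (1 - D_n/n)(Z_n - \Theta_0) + n^{-(2-\gamma)/2} V_n + n^{-(3-\gamma)/2} T_n,$$

where

$$V_n = h_n^{-1/2} E\{K((Z_n - X_n)/h_n)\, Y_n \mid \mathcal{B}_n\} - h_n^{-1/2} K((Z_n - X_n)/h_n)\, Y_n,$$
$$T_n = n^{(1-\gamma)/2}\{\bar m(Z_n) - E[K_{h_n}(Z_n - X_n)\, Y_n \mid \mathcal{B}_n]\}$$

and $\{D_n\}$ is a sequence of random variables converging almost surely to $\bar m'(\Theta_0)$ such that
$D_n(Z_n - \Theta_0) = \bar m(Z_n)$. Such a sequence exists because $\bar m$ is differentiable in $\Theta_0$ and $Z_n \to \Theta_0$
almost surely by Theorem 1. The assumptions on $a_n$ and $h_n$ imply that
$T_n \to \frac{1}{2} \int u^2 K(u)\,du\; \bar m''(\Theta_0)$. Note that $E(V_n \mid \mathcal{B}_n) = 0$ and that by (A7) and (A6.1),

$$E(V_n^2 \mid \mathcal{B}_n) \to \int K^2 \int y^2 f(\Theta_0, y)\,dy \quad \text{almost surely},$$
$$E(V_n^2) = O(1).$$

Furthermore we have for all $r > 0$

$$E(V_n^2\, I(V_n^2 \ge rn) \mid \mathcal{B}_n) \le O(h_n^{-2})\, P(V_n^2 \ge rn \mid \mathcal{B}_n) \le O(h_n^{-2} n^{-1}) = o(1) \quad \text{almost surely}.$$

The lemma follows now from the generalization of a theorem of Walk (1977), given by
Berger (1980).

□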

The following lemma gives an analogous result for the Kiefer-Wolfowitz type sequence
$\{Z'_n\}$ defined in (2).

Lemma 2. Let the conditions of Theorem 6 be satisfied. Define $W_n(t)$ as in Lemma 1 but with
$r$ in the place of $\bar m$ and

$$R_k = k^{1/2}\, k^{(1-4\gamma)/2}(Z'_k - \Theta_M).$$

Then $W_n(t)$ converges weakly in $C[0, 1]$ to the Gaussian process

$$G_2(t) = \Big( f_X(\Theta_M) \int y^2 f(\Theta_M, y)\,dy \int K^2 \int (K')^2 \Big)^{1/2} \int_{(0,1]} u^{r'(\Theta_M) - (2-4\gamma)/2}\, dW(tu), \quad 0 \le t \le 1.$$

Proof of Theorem 2. Use Lemma 1 and evaluate $G_1(t)$ at $t = 1$.

Proof of Theorem 3. Define the sequence

$$b(d) = \inf\{n \in \mathbb{N} \mid \sigma^2(\gamma) \le n^{1-\gamma} d^2 / z_{\alpha/2}^2\}.$$

The estimators $S_{1n}$, $S_{2n}$ defined in (4), (5) converge to $\int y^2 f(\Theta_0, y)\,dy$ and $\bar m'(\Theta_0)$ respectively. This
entails that, as $d \to 0$,

$$N(d)/b(d) \to 1 \quad \text{almost surely}.$$

Now apply Lemma 1.

Proof of Theorem 4. Like the proof of Theorem 1.

Proof of Theorem 5. Use Lemma 2 and evaluate $G_2(t)$ at $t = 1$.

Proof of Theorem 6. Similar to the proof of Theorem 3.